Camera based interaction and instruction

ABSTRACT

Disclosed are methods and apparatus for instructing persons using computer based programs and/or remote instructors. One or more video cameras obtain images of the student or other participant. The images are analyzed by a computer to determine the locations or motions of one or more points on the student. This location data is fed to a computer program which compares the motions to known desired movements, or alternatively provides such movement data to an instructor, typically located remotely, who can aid in analyzing student performance. The invention preferably is used with a substantially life-size display, such as a projection display can provide, in order to make the information displayed a realistic partner or instructor for the student. Applications to sports training, dance, and remote dating are also disclosed.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of application Ser. No. 13/459,670, filed Apr. 30, 2012 (now U.S. Pat. No. ______), which is a continuation of application Ser. No. 12/891,480, filed Sep. 27, 2010 (now U.S. Pat. No. 8,189,053), which is a continuation of application Ser. No. 11/376,158, filed Mar. 16, 2006 (now U.S. Pat. No. 7,804,530), which is a continuation of application Ser. No. 09/568,552, filed May 11, 2000 (now U.S. Pat. No. 7,015,950), which claims the benefit of U.S. Provisional Application No. 60/133,671, filed May 11, 1999.

REFERENCES TO RELATED APPLICATIONS BY THE INVENTOR

This application is a related application of application Ser. No. 09/435,854 filed Nov. 8, 1999, which was a continuation of application Ser. No. 08/496,908 filed Jun. 29, 1995, now U.S. Pat. No. 5,982,352, which was a continuation-in-part of application Ser. No. 08/290,516, filed Aug. 15, 1994, now U.S. Pat. No. 6,008,000, which was a continuation of application Ser. No. 07/946,588, filed Sep. 18, 1992, now abandoned.

This application is also a related application of co-pending application Ser. No. 09/138,339 filed Aug. 21, 1998 and Provisional Patent application No. 60/142,777 filed Jul. 8, 1999.

The disclosures of the following U.S. patents and co-pending patent applications by the inventor, or the inventor and his colleagues, are incorporated herein by reference:

1. U.S. Pat. No. 4,629,319 (Panel Surface Flaw inspection, which discloses a novel optical principle commonly called “D Sight”), and U.S. Pat. Nos. 5,362,970, 5,880,459, 5,877,491, 5,734,172, and 5,670,787.

2. U.S. application Ser. No. 09/435,854 and U.S. Pat. No. 5,982,352, and U.S. Ser. No. 08/290,516 (“Man Machine Interfaces”), filed Aug. 15, 1994, now U.S. Pat. No. 6,008,000, the disclosure of both of which is contained in that of 09/435,854.

3. U.S. application Ser. No. 09/138,339, Useful man machine interfaces and applications.

4. U.S. application Ser. No. 09/433,297, More useful man machine interfaces and applications.

Provisional Patent Applications

5. Camera Based Applications of Man-Machine Interfaces U.S. Ser. No. 60/142,777.

6. Methods and Apparatus for Man Machine Interfaces and Related Activity, Ser. No. 60/133,673.

7. Tactile Touch Screens for Automobile Dashboards, Interiors and Other Applications, Ser. No. 60/183,807, filed Feb. 22, 2000.

8. Apparel Manufacture and Distance Fashion Shopping in Both Present and Future, Ser. No. 60/187,397, filed Mar. 7, 2000.

9. Weight Loss and Fashion Shopping, by Marie C. Pryor and Timothy R. Pryor, Ser. No. 60/187,396, filed Mar. 7, 2000.

The disclosures of the above referenced applications are incorporated herein by reference.

INTRODUCTION

Methods and apparatus are disclosed to enhance the quality and usefulness of picture taking for pleasure, commercial, or other business purposes. In a preferred embodiment, stereo photogrammetry is combined with digital image acquisition to acquire or store scenes and poses of interest, and/or to interact with the subject in order to provide data to or from a computer. Other preferred embodiments illustrate applications to control of display systems.

BACKGROUND

Representative U.S. patents on digital cameras are U.S. Pat. Nos. 5,534,921, 5,249,053 and many others which describe use of matrix array (CCD or otherwise) based cameras to take pictures of humans or other objects. The images taken generally comprise 400,000 or more pixels, which are often compressed to smaller record sizes for data storage, for later retrieval and display. Video cameras or Camcorders are also increasingly able to take still photographs, and record or transmit them to computers.

Aside from exposure control (to keep the light reaching the detector array within the dynamic range of same), and range finding (to effect the best lens focus given the object distance in question) there are few cases known to the inventor where the camera taking the picture actually determines some variable in the picture and uses it for the process of obtaining the picture.

One such example, which does not take a picture of humans but rather of data, is U.S. Pat. No. 4,791,589, where a certain wave form signature on an oscilloscope is searched for by processing the digital camera image, and when it is seen, the image is stored.

More apropos the function of “Picture Taking” as the general public knows it, and of interest as the primary focus of the instant invention, is U.S. Pat. No. 5,781,650 by Lobo, et al, which describes analysis after the fact of recorded images to determine facial content and thus the age of the subject. This disclosure also alludes to a potential point and shoot capability also based on the age classification of the individuals whose picture is desired.

I am aware of no picture taking reference based on object position and orientation with respect to the camera or to other objects.

SUMMARY OF THE INVENTION

High Resolution Digital still cameras employing matrix photodetector array chips to scan the image produced by the camera lens are now commonplace, and will be even more so in a few years as chips and memories become very inexpensive, and pixel density approaches 2000×2000 pixels, rivaling photographic film. Even today Camcorders having 700×500 pixel image chips are common for video based data and stills.

This invention is aimed at improvements in utilization of these cameras and others which make use of a computer based camera's ability to analyze, in real time if desired, the images obtained. Indeed a picture taking system may be composed of a combination of cameras, some used for purposes other than the recording of the picture proper.

It is a goal of the invention to provide a method for taking pictures when certain poses of objects, sequences of poses, motions of objects, or any other states or relationships of objects are represented. It is also a goal to allow this to be done in a self timer like mode, when desired scene situations or specific dates or other circumstances exist. In some cases, information as to what is desired may be entered remotely, even over the internet, or radio telephone.

It is also a goal of the invention to provide a method for selecting from a digital or other picture memory, pictures obtained when certain pre programmed poses of objects, sequences of poses, or relationships of objects are represented.

It is a further goal of the invention to provide means by which users engaged in digital camera based activities, or other activities, using a computer can have their pictures taken.

It is a still further goal to provide all such functions in a 2D or 3D context, and using simple equipment capable of widespread use.

It is another goal of the invention to feed back data to a subject or subjects having his or her, or their picture taken, in order that they assume another pose or engage in another activity, or juxtaposition of subject positions.

While this invention is primarily aimed at the general picture taking public at large, it is realized that commercial photographers and cine-photographers, for example in the coming trend to digital “Hollywood” movie making, may benefit greatly from the invention herein, as it potentially allows more cost effective film production by giving the director the ability to expose the camera to the presence of masses of data, but only saving or taking that data which is useful, and if desired, to signal the creation of further data based on data obtained. All this can be done with little or no human intervention as desired, thus saving on the cost of direction, film crews, and other labor or venue related costs.

DRAWINGS DEPICTING PREFERRED EMBODIMENTS OF THE INVENTION

FIG. 1 illustrates means by which users engaged in digital camera based activities, or other activities, using a computer can have their pictures taken.

FIGS. 2A-D illustrate a method for taking pictures when certain pre programmed poses of objects, sequences of poses, or relationships of objects are represented.

FIG. 3 illustrates a self timer like mode, or when specific dates or other circumstances exist, including a system embodiment for taking pictures in shopping malls or other locales and providing instant print or other hardcopy capability (e.g. on a tee shirt).

FIG. 4 illustrates means to provide all such functions in a 2D or 3D context, using simple equipment capable of widespread use. Various retroreflective artificial target configurations are also disclosed.

FIG. 5 illustrates a method to feed back data to a subject having his or her picture taken, in order that the subject assume another pose or engage in another activity.

FIG. 6 illustrates a commercial version of the invention useful for police departments and real estate agents, among others.

FIG. 7 illustrates an embodiment of the invention used for photography of stage performances.

FIG. 8 illustrates an embodiment of the invention used for ballet instruction and other teaching and interaction activities also with remotely located instructors or players.

EMBODIMENTS OF THE INVENTION

FIG. 1

FIG. 1 illustrates means by which users engaged in digital camera based activities, or other activities, using a computer can have their pictures taken, and in this context, FIG. 1 resembles that of co-pending referenced application 9 above. A single camera, or a set such as a stereo pair, is employed to see portions of an object, such as a person, a part of a person such as a hand, leg, foot, fingers, or head, and/or to view datums on an object, portion of an object, or an object held by the person or with which the person interacts. In addition, multiple persons and objects can be seen.

Where a single camera is employed, 2D measurements of object location relative to the camera (x and y perpendicular to the camera axis) are all that is possible, unless datums of known shape or spacing are used on the object viewed. Where a stereo pair or more of cameras are employed, 3D (xyz) data of a single point can be provided, for example retro-reflector 50 on the head 52 of person 51. In both cases where 3 or more datums are used on an object, 6 Degree of freedom data can be obtained, allowing object orientation in 3 angular axes as well as range in 3 axes to be obtained. With two or more cameras, such 3D data may also be obtained using other features of objects such as edges of arms and the like, using known photogrammetric techniques.

The cameras used may also be used to take pictures of an object, or another specialized camera may be used for that purpose in conjunction with those used to determine the location of object features. Both examples are illustrated in this application.

As shown in this figure, two cameras 101 and 102 are used as a stereo pair, with the cameras located at opposite sides of a TV monitor 105, used for either computer or Television display or both. This is a desirable configuration commercially and is discussed in the co-pending application references above. In this particular case, an additional camera 110 is shown in the middle of the other two, said added camera used for picture taking, internet telephony and/or other purposes. An optional auxiliary LED light source 115 (or 116 or 117) for illuminating a user 60 or other object is also shown.

All three cameras are connected to the computer 130 by means of a USB (Universal Serial Bus) daisy chain, or IEEE 1394 firewire connections (faster). Each is accessed as needed for position and orientation determination, or picture taking.

Even using a single camera in two dimensions (as is normal today), some position and orientation data or sequences of same can be achieved using modern image processing techniques. (See for example the invention disclosed in U.S. Pat. No. 4,843,568 of Myron Krueger). However, accurate sensing and control of systems, such as cameras herein, is difficult today with processors cost effective enough to be used by the public at large, and artificial target augmentation of image points is often desirable.

It is thus possible using the invention to take pictures of users of interactive computer systems for whatever purpose. This allows one to automatically capture images of children at play, for example with a computer system such as a computer game. It also enables many other functions which are described below. And it can be used in the field, where the computer, stereo position sensing and picture taking camera may be co-located in the same housing.

It is noted that where retro-reflectors are used (as opposed to choosing less contrasting datums, for example natural object features such as edges of fingers, or clothing features, or targets such as colored dots), each of the two cameras for stereo location determination needs lights substantially co-located with the camera axes to illuminate the retro-reflectors. These lights can alternatively provide general lighting for any other camera or cameras to use in taking photographs or for other purposes.

It is noted that cameras 101 and 102 need not have the image of the retro-reflector or other discernible target in precise focus; indeed it is often helpful to have some blur due to defocusing so as to aid sub pixel position solution of datum location. If the LEDs or other light sources are in the near infrared, and the camera lenses are focused in the visible, this occurs naturally, unless the lens is also near infrared chromatic corrected.
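The benefit of slight defocus for sub pixel location can be illustrated with an intensity-weighted centroid. What follows is a minimal sketch in Python, offered only as an illustration and not as the method of the invention; the function name `spot_centroid` and the sample patch values are assumptions introduced here.

```python
def spot_centroid(patch):
    """Return the intensity-weighted centroid (x, y) of a 2D patch.

    patch is a list of rows of non-negative pixel intensities. A blurred
    spot spreads over several pixels, so the weighted mean can fall
    between pixel centers (a sub pixel estimate).
    """
    total = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for y, row in enumerate(patch):
        for x, value in enumerate(row):
            total += value
            sum_x += x * value
            sum_y += y * value
    if total == 0:
        raise ValueError("patch contains no signal")
    return sum_x / total, sum_y / total


# Hypothetical patch: a blurred spot lying between columns 1 and 2.
patch = [
    [0, 10, 10, 0],
    [0, 30, 90, 0],
    [0, 10, 10, 0],
]
cx, cy = spot_centroid(patch)
```

With a perfectly focused spot confined to one pixel, the same calculation could only return an integer pixel position; the spread of the blur is what carries the fractional information.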

An optional laser pointer (or other suitable illumination source), comprised of diode laser and collimating optics 150, is also usable with the invention to illuminate object portions from which 3D data is desired (such as the neck region of person 51 as shown), or in the simpler case to designate which areas of a picture are to be focused, zoomed in on, transmitted or recorded, with or without consideration of 3-D position data of the object. This can be fixed as shown, or optionally hand held by the user, for example in the left hand (dotted lines), and used by him or her to designate the point to be measured in 3D location (see also references above). In addition, a person taking pictures, such as a photographer, can without looking through the viewfinder of the camera point to a point on the subject, which is then dealt with by the camera, typically by focusing the lens system such that the point is in the desired state of focus (usually but not necessarily when the laser spot on the subject appears smallest in diameter and/or of highest contrast). Such a system is particularly useful for cameras with wide fields of view, or those mounted on pan tilt mechanisms, where the mechanism can also be activated to position the camera axis to take the picture with the laser spot, for example, centered in the camera field.

In the laser designated case, it is generally the laser spot or other indication on the surface that is imaged (although one can also instruct the camera processor, for example using voice recognition software in computer 130 with input via voice activated microphone 135, to obtain and store if desired the image of the area around the spot projected onto the object as well or alternatively). If the spot is desired, it is often useful that cameras 101 and 102 have band-pass filters which pass the laser wavelength, and any LED illumination wavelengths used for retro-reflector illumination for example, but block other wavelengths to the extent possible at low cost. It is noted that the discrimination in an image can also be made on color grounds: with red diode lasers and red LEDs, the system can analyze the image areas containing reds in the image, for example, with the knowledge that the answer can't lie at any shorter wavelengths (e.g. green, yellow, blue).
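Discrimination on color grounds can be sketched as a simple per-pixel test that keeps only strongly red pixels as candidates for the laser spot or LED-lit target. This is an illustrative Python sketch under stated assumptions: the function name `red_candidate_pixels` and the threshold values are hypothetical and are not taken from the disclosure.

```python
def red_candidate_pixels(image, min_red=200, max_other=80):
    """Return (x, y) positions whose color is strongly red.

    image is a list of rows of (r, g, b) tuples with 0-255 channels.
    The thresholds are illustrative assumptions only.
    """
    hits = []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            # Bright in red, dim in the shorter-wavelength channels.
            if r >= min_red and g <= max_other and b <= max_other:
                hits.append((x, y))
    return hits


# Hypothetical 2x3 image: dark, red, white / green, dark, red.
image = [
    [(20, 20, 20), (250, 30, 25), (240, 240, 240)],
    [(30, 200, 40), (10, 10, 10), (255, 60, 70)],
]
hits = red_candidate_pixels(image)
```

Note that the white pixel is rejected even though its red channel is high, because its green and blue channels are also high.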

By using two cameras 101 and 102, a superior ranging system for the laser spot location on the subject results, since the baseline distance “BL” separating the cameras for triangulation based ranging purposes can be sufficient to provide accurate measurement of distance to the object.

FIGS. 2A-D

As we begin to consider the apparatus of FIG. 1, it is clear one could do much more to enhance picture taking ability than heretofore described and contained in the prior art. And it can be done with apparatus capable of field use.

FIGS. 2A-D, for example, illustrate a method for taking pictures when certain pre programmed or otherwise desired poses of objects, sequences of poses, or relationships of objects are represented. No such ability is available to photographers today.

Consider still camera system 201, patterned after that of FIG. 1 and comprising 3 cameras and associated image scanning chips. The central camera, 202, is for picture taking and has high resolution and color accuracy. The two cameras on either side, 210 and 211, may be lower resolution (allowing lower cost, and higher frame rate, as they have fewer pixels to scan in a given frame time), with little or no accurate color capability, as they are used simply to see object positions or special datum positions on objects (which may however be distinguished by colors, for example as taught in some of my co-pending inventions).

Cost wise the distinction between cameras is important. Today low cost CMOS chips and lenses capable of providing stereo measurements as described above are $15 or less. High quality CCD color detector arrays and lenses for high quality photo images are over $100, and in many cases $1000 or more.

An optical viewfinder 215 is one of many ways to indicate to the user what scene information is being gathered by the camera system. The user can in this invention specify with a viewfinder based readout the area of the field that is desired. Use of the viewfinder in this manner, whether looked through or displayed on a screen, is for example an alternative to designating an area on the actual object using a laser pointer for the purpose.

The camera system 201 further contains a computer 220 which processes the data from cameras 210 and 211 to get various position and/or orientation data concerning a person (or other object, or persons plural, etc). Integral light sources as described in FIG. 1 above may also be provided such as LED arrays 240 and 245 and xenon flash 246.

In general, one can use the system to automatically “shoot” pictures for example, when any or all of the following occur, as determined by the position and orientation determining system of the camera of the invention:

1. Subject in a certain pose.

2. Subject in a sequence of poses.

3. Portion of Subject in a sequence of poses (e.g. gestures).

4. Subject or portion(s) in a specific location or orientation.

5. Subject in position relative to another object or person. For example, this could be bride and groom kissing in a wedding, boy with respect to cake on birthday, and sports events sequences of every description (where the camera can even track the object datums in the field and if desired adjust shutter speed based on relative velocity of camera to subject).

6. Any of the above with respect to both persons in certain poses or gesture situations.

7. When a subject undertakes a particular signal comprising a position or gesture, i.e. a silent command to take the picture (this could be programmed, for example, to correspond to raising one's right hand).

In addition it is noted that the invention acts as a rangefinder, finding range to the subject, and even to other subjects around the subject, or to all parts of interest on an extensive subject. This allows a desired lens focus to be set based on any or all of this data, as desired. It also allows a sequence of pictures to be taken of different objects or object portions, at different focal depths, or focus positions. The same holds true for exposure of these locations as well.

It is also possible to use the above criteria for other purposes, such as determining what to record (beyond the recording that is implicit in taking pictures), or in determining what to transmit. The latter is important vis a vis internet activity, where available internet communication bandwidth limits what can be transmitted (at least today). In this case video telephony with the invention comprehends obtaining only those images you really care about in real time. So instead of transmitting low resolution image data at 20 frames a second, you can transmit say 5 (albeit asynchronously gathered) frames of high resolution preferred data. (This doesn't solve flicker problems, but it does mean that poor quality or extraneous material isn't sent!). Criteria such as degree of image motion blur or image focus can also be used in making transmission decisions.
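A transmission decision based on image focus or blur can be sketched by ranking frames with a sharpness score and sending only the best few. The Python below is a minimal sketch under stated assumptions: the score (sum of squared differences between horizontally adjacent pixels) is one simple stand-in for a focus measure chosen for illustration, and the names `sharpness` and `select_frames` are hypothetical.

```python
def sharpness(frame):
    """Sum of squared differences between horizontally adjacent pixels.

    frame is a list of rows of grayscale values. Blurred frames have
    smaller pixel-to-pixel differences and so score lower.
    """
    return sum(
        (row[i + 1] - row[i]) ** 2
        for row in frame
        for i in range(len(row) - 1)
    )


def select_frames(frames, keep):
    """Return the indices of the `keep` sharpest frames, in capture order."""
    ranked = sorted(
        range(len(frames)),
        key=lambda i: sharpness(frames[i]),
        reverse=True,
    )
    return sorted(ranked[:keep])


# Hypothetical one-row frames for illustration.
frames = [
    [[0, 100, 0, 100]],  # crisp edges
    [[40, 60, 40, 60]],  # same pattern, blurred
    [[0, 90, 10, 100]],  # fairly crisp
    [[50, 50, 50, 50]],  # featureless
]
chosen = select_frames(frames, keep=2)
```

In practice the same selection could be combined with the pose criteria listed above, so that only frames that are both of interest and sharp are sent.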

FIG. 2B illustrates a block diagram showing a pose analysis software or hardware module 250 analyzing processed image data (for example utilizing camera image data processed by visionbloks software from Integral Vision Corp.) from the computer 220 (which may be the same physical microprocessor, such as an Intel Pentium 2 in a Dell Inspiron 3500 laptop computer, or different) and determining from same when a certain pose for example has been seen. When this occurs, a signal is sent to the camera control module 255 to hold the last frame taken by camera 202, and to display it to the photographer, digitally store it, or transmit it to someone else, or to another data store or display. Such transmission can be by data link, internet, cell phone, or any other suitable means.

Another criterion could be that two or more preselected poses were seen one after the other, with a time delay between them, also pre-selected if desired.
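A two-pose sequence criterion with a preselected delay can be sketched as a scan over time-stamped pose labels. This is an illustrative Python sketch only; it assumes some upstream pose analysis has already labeled each frame, and the function name `sequence_trigger` and the pose labels are hypothetical.

```python
def sequence_trigger(events, first, second, min_delay, max_delay):
    """Return the time at which `second` follows `first` within the window.

    events is a list of (timestamp_seconds, pose_label) in time order.
    Returns None when the sequence is never seen.
    """
    first_time = None
    for t, pose in events:
        if pose == first:
            # Remember the most recent sighting of the first pose.
            first_time = t
        elif pose == second and first_time is not None:
            if min_delay <= t - first_time <= max_delay:
                return t
    return None


# Hypothetical labeled frames, e.g. a toast with arm raised then lowered.
events = [
    (0.0, "standing"),
    (1.0, "arm_raised"),
    (1.2, "standing"),
    (2.5, "arm_lowered"),
]
trigger_time = sequence_trigger(events, "arm_raised", "arm_lowered", 1.0, 3.0)
```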

FIG. 2C illustrates a specific case whereby a point on one person, say hand 260 of man 265 having head 271, is determined, and a picture is taken by camera system 201 of the invention when this point comes within a distance of approximately 6 inches (or any other desired amount including contact, i.e. zero distance) from another person or object, say the head 270 of woman 275. To obtain the data, one can look for hand or head indications in the image using known machine vision techniques, and/or in a simpler case put a target marker such as colored triangle 285 or other type on the hand or head or both and look for it.

The use of the natural features of the subjects' heads, which are distinguishable by shape and size in a known field containing two persons, is now illustrated. For example, image morphology or template matching in the image field of the solid state TV camera 202 can be used to distinguish the head shapes from background data and data concerning the rest of the features such as hands, etc. of subjects 265 and 275 (or conversely hand shapes if desired can be found and heads excluded, or the hand of the right person, versus the head of the left, and so forth).

As shown in FIG. 2D, the image field 287 of camera 202 after processing contains the two head images, 290 and 291, spaced a distance “W”. When W is not within a tolerance D, the picture is not taken; whereas if the heads are close enough, within D as illustrated in dotted lines, the picture is taken.
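The W-versus-D test of FIG. 2D can be sketched in a few lines. This is a minimal Python illustration assuming the head centroids have already been found in image coordinates; the function name `should_take_picture` and the pixel values are hypothetical.

```python
import math


def should_take_picture(head_a, head_b, tolerance_d):
    """True when the spacing W between two head centroids is within D.

    head_a and head_b are (x, y) image positions; tolerance_d is in the
    same units (pixels here).
    """
    w = math.hypot(head_b[0] - head_a[0], head_b[1] - head_a[1])
    return w <= tolerance_d


# Heads far apart: no picture. Heads close together: picture taken.
apart = should_take_picture((100, 120), (400, 120), tolerance_d=80)
close = should_take_picture((220, 120), (280, 120), tolerance_d=80)
```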

Criteria as mentioned can include proximity of other parts of the body, or objects associated with the subjects (which themselves can be objects). In addition, the motion or relative motion of objects can be the criterion. For example, one could program the device to take the picture when on two successive frames the condition shown in FIG. 2D exists where the heads are apart in frame 1, but closer in frame 2 (probably corresponding to a movement say of the boy to kiss the girl). Clearly other sequences are possible as well, such as movement taking place in several frames followed by a sequence of frames in which no movement occurs. Other means to determine motion in front of the camera can also be used in this context, such as ultrasonic sensors.
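The successive-frame criterion can be sketched as a check on how the head spacing changes from one frame to the next. The Python below is illustrative only; it assumes the spacing has already been measured per frame, interprets "apart, then closer" as crossing from outside to inside a tolerance D, and uses the hypothetical name `approach_detected`.

```python
def approach_detected(spacings, tolerance_d):
    """Return the index of the first frame where the heads moved into range.

    spacings is a list of per-frame head spacings. A trigger requires the
    heads to be apart (beyond D) in one frame and within D in the next.
    Returns None if that never happens.
    """
    for i in range(1, len(spacings)):
        if spacings[i - 1] > tolerance_d and spacings[i] <= tolerance_d:
            return i
    return None


# Hypothetical spacings in pixels over four successive frames.
trigger_frame = approach_detected([300, 180, 90, 60], tolerance_d=100)
```

A pair of heads that starts out already close together does not trigger under this reading, since no approach movement was observed.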

It is also noted that the actual position or movement desired can be “Taught” to the computer 220 of the picture taking system. For example, a boy and girl in a wedding could approach each other and kiss beforehand. The sequence of frames of this activity (a “gesture” of sorts by both parties) is recorded, and the speed of approach, the head positions and any other pertinent data determined. When the photographer thinks the picture is right, the computer of the camera system is instructed to take the picture; for example it could be at the instant when, after a suitable approach, two head images become joined into one, which is easily recognizable with machine vision processing software under uniform background conditions. Then in the future, when such a condition is reached in the camera field of view, pictures are taken and stored, or transmitted. This allows a camera whose image field, for example, takes in the head table at a wedding party to free run, taking only the shots thought to be of most interest. Numerous conditions might be programmed in, or taught in; another at the same party would be anyone at the head table proposing a toast to the bride and groom, with arm and glass raised. If video is taken, it might be taken from the point at which the arm rises, until after it comes down. Or, with suitable voice recognition, it might be taken when certain toast type words are heard, for example.

Application to “3-D” Pictures

Where it is desired to take “3-D” pictures, it can be appreciated that each camera, 210 and 211, can take images of the scene in place of camera 202, and that the outputs of both cameras 210 and 211 can be stored for later presentation in a 3D viewing context, using known display techniques with appropriate polarized glasses or switchable LCD goggles for example. In this case the camera outputs can serve double duty if desired, each both recording picture data, as well as determining position of one or more points on the object or objects desired.

In addition, or alternatively, one can use in this 3D picture case the camera 202 (or even a stereo camera pair in place of 202) as a means for determining position and orientation independently from the stereo picture taking cameras.

If not used for immediate position information, camera 202 does not have to be digital and could employ film or other media to record information.

FIG. 3

In a manner resembling that of FIGS. 2A-D above, the invention can also serve to aid a person to take his or her own picture, a modern “Self timer” if you will. For example any or all of the criteria such as the items 1-7 above can be used as criteria for the picture to be taken of oneself. This is in addition to other more normal things like taking pictures after a certain time, or on a certain date or time interval, etc. This has particular appeal for taking pictures of one's self, or in any other situation where the photographer is not present (e.g. unattended recording of animals, children, etc.). Similarly, a hand signal or other signal to the camera can be used to trigger the picture to be taken, using the computer camera combination to determine the hand position or movement. This can also be done by voice using microphone input and suitable voice recognition software in the computer.

Today, in a conventional context, one can as a photographer choose to shoot a fashion model or other subject, and when you see a pose you like, record the picture. But as one's own photographer, this is much more difficult, unless you stream in video and search through the poses after the fact. But even then, you don't know that the poses were what was desired, as no feedback exists during the shoot.

With the invention, you may program the system to take only those poses which you think you want to get. And it can inform the subject when a picture is taken (the lack thereof indicating that the subject should do something different to obtain the desired effect resulting in a picture). The effect desired can be changed in midstream to adjust for changing wants as well, by changing the program of the computer (which could be done using hardware switches, inserting a disc, or otherwise entering a command). In addition, as mentioned above, the gesture or pose desired can be taught to the system, by first photographing a variety of acceptable positions or sequences, and putting bounds on how close to these a pose must be to be accepted for photographing.
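Teaching by example with bounds can be sketched as building a per-coordinate acceptance range from a set of taught poses. This Python sketch is an illustration under stated assumptions, not the disclosed implementation: a pose is represented as a flat tuple of measured coordinates, and the names `teach`, `matches`, and `margin` are hypothetical.

```python
def teach(examples, margin=0.0):
    """Build per-coordinate (low, high) bounds from taught example poses.

    Each example is a flat tuple of coordinates, e.g. (hand_x, hand_y).
    margin widens the accepted range beyond the taught extremes.
    """
    columns = list(zip(*examples))
    return [(min(c) - margin, max(c) + margin) for c in columns]


def matches(pose, bounds):
    """True when every coordinate of pose lies inside its taught bounds."""
    return all(low <= value <= high for value, (low, high) in zip(pose, bounds))


# Three hypothetical taught hand positions, accepted with a margin of 2.
bounds = teach([(10, 50), (14, 54), (12, 48)], margin=2)
accepted = matches((15, 47), bounds)
rejected = matches((20, 50), bounds)
```

Widening or narrowing `margin` corresponds to loosening or tightening how close a new pose must be to the taught ones before a picture is taken.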

A specialized case is shown in FIG. 3, for a self taking instant picture or printout device for use in a shopping mall Kiosk or other venue. In this case two sweethearts 300 and 310 are on a bench 315 in front of the digital or other camera 320. When the computer 330 detects from processing the image (or images) of the invention that their faces are in close proximity (for example using the centroid of mass of their heads as the position indicator, or even facial features such as described in the Lobo et al patent reference), the computer then instructs the camera to record the picture. A push button or other selector on the device allows the subjects to select what criteria they want, for example when their heads are together for 5 seconds or more, or not together, or hands held, or whatever. Or when their faces are within a certain distance criterion, such as one inch.

Alternatively, camera 320 may be a video camera and recorder which streams in hundreds or even thousands of frames of image data, and the selection of a group is made automatically by the invention in rapid fashion afterwards, with the subjects selecting their prints from the pre-selected (or taught as above) images as desired. Or the machine itself can make the final selection from the group, sort of as a random slot machine for pictures so to speak, and print the picture using inkjet printer 350 for example. Such a situation could be provided at less cost for example, with an incentive to add in your own criteria for an extra cost, and get pictures to choose from more along the lines desired. Note that in addition to, or instead of prints, they could have magnetic or other machine readable media to take home too.

FIG. 4

FIG. 4 illustrates means to provide all such functions in a 2D or 3D context, using simple equipment capable of widespread use.

For example, the simplest case is to use the same single camera such as 110, to both take the picture, and to determine location, according to the invention, of one or more points on the object or objects for purposes of controlling the picture taking, recording, or transmission process in some way.

As has been disclosed in the aforementioned referenced co-pending applications, one can view using the single camera, one or more such points in two dimensions, or in three dimensions under certain conditions when spaced points on the object have known spacing between them on the surface of the object.
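How known spacing yields range with a single camera can be illustrated with the pinhole relation z = f · S / s, where S is the true spacing between two datums, s is their spacing in the image, and f is the lens focal length expressed in pixels. This Python sketch is an assumption-laden illustration: it presumes the two datums lie roughly perpendicular to the camera axis, and the function name and numbers are hypothetical.

```python
def range_from_known_spacing(focal_length_px, true_spacing, image_spacing_px):
    """Estimate range to a pair of datums of known spacing (pinhole model).

    focal_length_px: focal length in pixel units (assumed calibrated).
    true_spacing: physical distance between the two datums.
    image_spacing_px: measured distance between them in the image.
    The result is in the same units as true_spacing.
    """
    if image_spacing_px <= 0:
        raise ValueError("image spacing must be positive")
    return focal_length_px * true_spacing / image_spacing_px


# Hypothetical values: datums 0.10 m apart appear 40 pixels apart.
z = range_from_known_spacing(800.0, 0.10, 40.0)
```

If the datum pair is tilted with respect to the camera, the image spacing shrinks and this simple estimate overstates the range, which is one reason three or more datums are used when orientation is also required.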

Identifying points from raw images is processing intensive, as isdetermination movement gestures of such images, such as an image of anarm or hand in a varying clothing and background situations. Butdetermining the location or movement of one or more artificial targetssuch as a colored retro-reflector is easy, accurate and fast, based onbrightness (under substantially coaxial illumination) and color—andpossibly shape as well if the target is of some distinguishable shape.

For example, consider retro-reflector (e.g. glass bead Scotchlight 7615tape by 3M company) 401, on the hand of a subject 404, theretro-reflector having a red reflection filter 405 matched to thewavelength of the LEDs 410 used with (and angularly positioned on ornear the axis 415 of) camera 420 comprising lens 421 and detector array422 used to take the picture of the object desired. When it is desiredto determine the position of the hand 404, the red LED's are turned onby camera controller 430, and a bright reflection is seen in the imageat the point in question due to the retro-reflection effect.

Where stereo pairs of cameras are used, as in FIG. 1 or 2A-D, two reflections are seen whose disparity in location from one camera to the other gives the z distance (range direction) from the camera. In this case light sources are located with each camera of the stereo pair in order that, for each camera, the retro-reflectors are properly illuminated with light emanating from a point or points angularly near the camera in question.
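As a minimal sketch of how disparity yields range, assume an idealized stereo pair: two identical pinhole cameras with parallel axes, separated horizontally by a known baseline, with image coordinates measured in pixels from each image center. These assumptions, and the names `triangulate`, `focal_px` and `baseline`, are illustrative and are not taken from the disclosure.

```python
def triangulate(x_left, x_right, y, focal_px, baseline):
    """Return (X, Y, Z) of a target in the left camera's coordinate frame.

    x_left, x_right: horizontal target position in each image, in pixels
                     from the image center.
    y:               vertical target position in pixels (same in both images
                     for an ideal rectified pair).
    focal_px:        lens focal length expressed in pixels.
    baseline:        camera separation; the result uses the same length unit.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("target must have positive disparity")
    z = focal_px * baseline / disparity   # range grows as disparity shrinks
    x = x_left * z / focal_px             # back-project through the left camera
    y3d = y * z / focal_px
    return x, y3d, z


if __name__ == "__main__":
    print(triangulate(x_left=40, x_right=0, y=20, focal_px=800, baseline=0.1))
```

Because range is inversely proportional to disparity, a one-pixel error in locating the bright spot matters more for distant targets than for near ones.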

The LEDs can be illuminated on alternate camera frames, or at any other time when “picture” type image data is not desired. In this case the camera, under room lights 445 say, does not normally see the retro-reflection signal in the picture frames, which is desirable as the bright spot of 401 is then absent from the image of the human desired. Processor 450, processing the data, can even be used to subtract out from the recorded image the shape of the retro-reflector, which might be a noticeably different shape than found in practice (e.g. a triangle). The image can be filled in where the subtraction occurred with the color, brightness, contrast and texture or other characteristics of the surroundings. This is particularly easy if the target (retro-reflector or otherwise) is placed on the human or object in a region of small variation in the characteristics needed to be filled in, e.g. the back of one's hand, say. The key is that after processing, the image looks like it would have without the addition of the artificial target.
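The fill-in step can be illustrated with a deliberately simple sketch. It assumes a grayscale image and a mask marking the target pixels (both hypothetical inputs), and it fills the masked region with the mean brightness of the unmasked pixels that border it. Matching color, contrast and texture, as the text contemplates, would need more than this.

```python
def fill_target(image, mask):
    """Return a copy of `image` with masked pixels replaced by the mean
    brightness of the unmasked pixels adjacent to the masked region.

    `image` is a list of rows of numbers; `mask` has the same shape and is
    True where the artificial target was found.
    """
    h, w = len(image), len(image[0])
    border = []
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                continue
            # Keep unmasked pixels with at least one masked 4-neighbour.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and mask[rr][cc]:
                    border.append(image[r][c])
                    break
    if not border:
        return [row[:] for row in image]
    fill = sum(border) / len(border)
    return [[fill if mask[r][c] else image[r][c] for c in range(w)]
            for r in range(h)]
```

A flat fill of this kind is only plausible where the surroundings vary little, which is the reason the text suggests placing the target on a region such as the back of the hand.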

If the LEDs are turned on by the camera controller during picture taking, color processing can be used to remove from the stored image of the scene any indications of bright zones at the LED wavelength used, filling in with the color of the surrounding area as desired.

Clearly both processing techniques just described, or others, can be used. The methods work well with stereo pairs of cameras too.

Retro-reflective or other distinguishable artificial targets can be provided in different decorative designs for wrist, back of hand, rings, forehead, hats, etc. For example, 3 targets in a heart or triangle shape, a square box of 4 targets, or a box or pyramid with line targets on its edges, and so forth.

Colored targets can be made of cloth, plastic, or the like, including colored plaids, polka dots, etc. Or coatings, filters or evaporated-on filters may be placed in front of a target such as a plastic retroreflector in order to render it of a given color (if it wasn't made of colored material in the first place).

Decorative line outlines (also possible in retroreflective bead material) can also be used as target datums, for example down the seam of glove fingers, or shoes, or belts, dress beading, etc.

FIG. 5

FIG. 5 illustrates further one of many methods by which the invention may be used to feed back data to a subject (or subjects) having his or her picture taken, in order that the subject assume another pose or engage in another activity.

For example consider FIG. 5. A girl 500 is having her picture taken by the camera of the invention 501 (in this case a single digital camera version such as illustrated in FIG. 4), and her positions, orientations or sequences of same, including motions between points, are analyzed as described above, in this case by computer 530. The computer has been programmed to look for funny movements and positions, defined here as when the arms are in unusual positions (clearly a subjective issue, programmed as to tolerances, or taught to the system by the person in control of the situation).

The girl then poses for the camera. When the camera of the invention takes the picture according to its preprogrammed criteria (in this case, for example, defined as when her arms are over her head, and after a significant movement has occurred), it lets her know by lighting light 520, connected by wires not shown to computer 530. During the photo shoot she then begins to learn what it is looking for (if she hasn't been already told) and does more of the same. If desired, an optional video display 540 or voice output speaker 550, both connected to computer 530, indicates to her what is desired. This could also be a particular type of pose, e.g. “Cheese-cake”, based on historic classical poses learned from photo art. (Note that she can also make comments for recording too, with optional microphone input not shown. As pointed out above, voice recognition software, such as IBM ViaVoice, can be used to recognize commands from the subject or photographer, and cause other results.)

It can be more sophisticated yet. For example, the computer 530 and any associated software as needed may be used to analyze the model's lips and her smile. In this manner, the invention can be used to photograph all “smiling” poses for example, or even poses where the smile is within certain boundaries of lip curvature. Similarly, the camera or cameras of the invention can be used, with suitable image analysis software, to determine when the subject's eyes are open a certain amount, or facing the camera for example.

FIG. 3 above has alluded to possible use of the invention data processing to determine position and/or orientation data from recorded picture frames, after the picture is taken. This provides a method for selecting from memory those pictures obtained when certain pre-programmed poses of objects, sequences of poses, or relationships of objects are represented.

Selection can be according to criteria such as, for example, 1-7 above, but there are some differences. First, if the data is taken normally from a single camera such as that of 202 above, 3D information is not available. This being the case, conventional 2D machine vision type image processing (e.g. “Vision Bloks” software from Integral Vision Corp.) can be used to extract object features and their locations in the images retained.

A second version alternatively could employ a single picture taking camera, but by employing 3 dot or other suitable targets on the photographed object in the camera field, could calculate 3D data related to the object (position and orientation in up to 6 axes can be so calculated by the computer of the invention using target location data in the camera image field).

A third version records data from the camera, or in the case of the FIG. 2A-D device, from all three cameras, all recorded for example on digital media such that the processing can be done after the fact, just as it would have been live.

Another application can be to monitor the relative change in successive pictures as seen by one or more relatively low resolution cameras and, when such change is minimal, cue the high resolution camera requiring a longer exposure to become enabled. In this manner blur of the high resolution camera image is avoided. This is useful in taking pictures of children, for example. This comparison of images can be made without actually measuring distances, but rather by looking for images which do not differ from one another by more than an acceptance band, thus indicating the motion has largely stopped. This can be determined by subtracting one image from the other and determining the number of pixels whose difference is above a threshold. The more such pixels, the less the images are alike. Other techniques can be used as well, such as correlation techniques.
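The subtraction test just described might be sketched as follows. The function names, the grayscale list-of-rows frame format, and the default threshold and acceptance band are assumptions made for illustration only.

```python
def changed_pixel_count(frame_a, frame_b, threshold=20):
    """Count pixels whose absolute brightness difference exceeds `threshold`."""
    return sum(
        1
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
        if abs(a - b) > threshold
    )


def is_still(frame_a, frame_b, threshold=20, max_changed_fraction=0.01):
    """True when few enough pixels changed that motion has largely stopped.

    `max_changed_fraction` plays the role of the acceptance band: the
    largest fraction of changed pixels still treated as "no motion".
    """
    total = len(frame_a) * len(frame_a[0])
    changed = changed_pixel_count(frame_a, frame_b, threshold)
    return changed <= max_changed_fraction * total


if __name__ == "__main__":
    previous = [[0] * 10 for _ in range(10)]
    current = [row[:] for row in previous]
    current[3][4] = 100           # one pixel changed out of 100
    if is_still(previous, current):
        print("cue high resolution camera")
```

In use, the low resolution camera would feed successive frames to `is_still`, and the long-exposure high resolution camera would be enabled only when it returns True.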

In some instances it is desirable to have, in taking pictures, a display such as 555, preferably (but not necessarily) life size. This display can be used not only to display the image 565 of the person whose picture is being taken, but also to display still (or video) images called up from computer memory or other media storage such as DVD discs, and the like. One use of the displayed images is to indicate to the subject a desired pose for example. This can be done by itself, or interactively using the invention. A computer generated and rendered 3D image can also be created using suitable 3D solid modeling software (such as CAD KEY) to show an approximate pose to the model.

For example the invention disclosed above allows one to automatically observe the expressions, gestures and countenance of a person, by determining the shape of their smile, the direction of eye gaze, and the positions or motion of parts of the body such as the head, arms, hands, etc. Analysis using pre-programmed algorithms or taught sequences can then lead to a determination as to what information to display on display 555, controlled in image content by display processor 560.

As one instance, suppose computer image analysis of data from camera 501 of the invention has determined that the person 500 is not smiling enough, and is in too stationary a pose. A signal from computer 510 is provided to display processor 560 so as to display on display 555 an image of someone (perhaps the same subject at an earlier time, or a computer generated likeness of a subject) having the characteristics desired. The person looks at this display, sees someone smiling more for example, and in one scenario tries to mimic the smile. And so forth. Alternatively, voice generation software, such as that included in IBM ViaVoice, can be used to computer generate a voice command, “Smile More” for example, rather than show a visual illustration of the effect desired.

FIG. 6

Let us now discuss some other applications of picture taking enabled by the invention. One embodiment can be used to determine location of items in a scene, for example furniture in a house, for which homicide studies or insurance fraud could be an issue (see also FIG. 1 above, as well as referenced co-pending applications).

For example, a detective (whose arm 600 is shown) arrives at a murder scene in a room, and he sets the stereo camera 610 of the invention disclosed in FIG. 2C on a tripod 620 (or other suitable location) and systematically designates, using laser pointer 630, any object desired, such as chair 640 impacted by the laser beam at point P. The camera/computer system of the invention locates the designated point and takes a picture of the room, or a portion thereof, including the zone of the designated point P, which stands out in the picture due to the laser spot brightness. Optionally, the stereo pair of cameras of the invention can digitize rapidly the xyz coordinates of point P, which can be superposed if desired on the image of the scene including point P itself and its immediate surroundings. This data can be processed by computer 660 as desired and either recorded or transmitted to a remote location along with the images as desired using known communication means. This work can be done outdoors, as well as inside. Numerous points to be digitized can be sensed and/or indicated, as desired.

The same digitization procedure can be used to digitize a room for a real estate person for example, to develop a data base on a house for sale. And many other such applications exist.

Finally it should be noted that the invention solves many famous problems of picture taking, for example of children. The digital camera images of the invention can be processed, for example using appropriate software such as Vision Bloks, to determine if the child's eyes are open (determined for example by recognizing the eye iris in the face area), and if so to take the picture, or after the fact, to select the picture from a group. Or a signal can be given by the system to the child to “open your eyes” so to speak. To determine if the eye is open, the image can be processed for example to look for the white of the eye, or to look for red reflections from the eye. This can even be done with deep red or near IR light sources like LEDs, which do not bother the child.

Similarly, if the child (or other subject) is in motion when you want him still, the picture can be analyzed until he is still, and then the picture taken or selected. This can be determined from comparison of successive frames, from motion blur or other characteristics of motion in the image. Or a signal as above can be given to the child to “sit still” (a famous command in picture taking annals).

FIG. 7

The invention can also be used for commercial photography and for producing motion pictures. One advantage is that very high resolution images of critical scenes can be taken at suitable exposure levels, but not so many as would overload the memory capacity of a camera system. A means to enhance this is now described.

It is noted that a camera having an ability to read individual pixels as desired, or at least to choose the lines of pixels to be read, can achieve high rates of scan if one knows a priori where to look for data. The same holds if one scans, say, every 20th pixel in each of the x and y directions of the camera to determine where frame to frame changes are occurring (due to change in pixel brightness or color). Once change is determined, one can often narrow those areas to the ones of interest. For example, even in a “Still” picture, the head often moves (similar to the lovers on the bench in the shopping mall mentioned above). Reading every 20th pixel in both directions cuts the number of pixels by 400 times, and raises a normal 30 Hz scan rate to over 1000 scans per second, more than needed in many cases.

When the area of interest is found, the pixels in that area are all scanned for example.
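The coarse scan followed by a full scan of the area of interest can be sketched as below. This is an illustrative sketch only: the frames are plain grayscale lists of rows standing in for an addressable detector, and the names `coarse_change_box` and `subsample_speedup`, the threshold, and the margin are assumptions. The speedup figure further assumes that readout time is proportional to the number of pixels read.

```python
def subsample_speedup(step):
    """Reduction in pixels read when sampling every `step`-th pixel in x and y."""
    return step * step


def coarse_change_box(prev, curr, step=20, threshold=20, margin=None):
    """Sample every `step`-th pixel in both directions and return the bounding
    box (top, left, bottom, right) of samples that changed by more than
    `threshold`, grown by `margin` pixels and clipped to the frame.
    Returns None when no sampled pixel changed.
    """
    if margin is None:
        margin = step            # cover the gap between neighbouring samples
    h, w = len(curr), len(curr[0])
    hits = [
        (r, c)
        for r in range(0, h, step)
        for c in range(0, w, step)
        if abs(curr[r][c] - prev[r][c]) > threshold
    ]
    if not hits:
        return None
    top = max(min(r for r, _ in hits) - margin, 0)
    left = max(min(c for _, c in hits) - margin, 0)
    bottom = min(max(r for r, _ in hits) + margin, h - 1)
    right = min(max(c for _, c in hits) + margin, w - 1)
    return top, left, bottom, right


if __name__ == "__main__":
    # 20 x 20 = 400 times fewer pixels; 30 Hz x 400 = 12000 scans per second
    print(subsample_speedup(20), 30 * subsample_speedup(20))
```

Under the proportional-readout assumption the ideal rate at a step of 20 is 12,000 scans per second, so the figure of over 1000 quoted above leaves ample room for addressing overhead. Every pixel inside the returned box would then be read at full resolution.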

Such pixel addressing cameras can also be used for determining the position and change in position of features used to determine, and track, pose and other variables, as has also been discussed in co-pending applications, particularly Camera Based Man-Machine Interfaces, U.S. Ser. No. 60/142,777, incorporated herein by reference. Of special interest is that the same high resolution camera can be used to take the picture desired, while at the same time being used to find or track the object at high speed.

Such high speed tracking can be interspersed with the taking of pictures. For example, in photographing a ballet, it may be desired only to take pictures of the prima ballerina, who typically is the one, with any male dancer, that is moving the most. By determining the zone to be measured, one can sense quickly what zone should be looked at, and obtain high resolution photographs from that zone. This allows one to use a very large format camera in a fixed location (e.g. 5000×5000 pixels) to cover the image of the whole stage via suitable optics, but to take and store only the pixels in a 1000×700 zone of movement, positional or gesture interest for example, providing about a 35 times increase over the frame rate available today with such large pixel cameras. This allows their practical use, without resort to human cameramen or pan/tilt mechanisms.

Similar logic holds for quarterbacks in a football game, who often run faster than any defense men around them and can be differentiated accordingly (along with any other issues such as uniform color, design or the like). If possible, it is desirable to have a clearly defined target, such as a retroreflective or bright colored target on one's helmet for example. Indeed helmet color can be chosen accordingly.

This is illustrated in FIG. 7, wherein camera 701, composed of lens 705 and an addressable version of a Kodak MegaPixel detector array 710 having 4000×4000 elements and under the control of computer 711, is used to scan the image of a pair of dancers 715 and 716 on stage 720. The field of view of the camera, equal to area ab, covers the whole stage. But the area scanned out from array 710 is confined to the region in which the dancers were last seen, which is defined as a zone a′b′ equal in this case to 500×500 pixels. This still allows DVD type resolutions to be achieved, without pan or tilt of the camera. Similarly such techniques can be used for video conferencing, sports, and other activities as well.
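The gain from reading out only a zone of interest follows from simple pixel counts. The short sketch below assumes, as a simplification not stated in the text, that frame time is proportional to the number of pixels read; the function name `readout_gain` is illustrative.

```python
def readout_gain(full_w, full_h, zone_w, zone_h):
    """Ratio of full-array pixels to zone pixels, i.e. the ideal frame-rate
    gain if readout time scales with the number of pixels read."""
    return (full_w * full_h) / (zone_w * zone_h)


if __name__ == "__main__":
    # Ballet example: 5000x5000 array, 1000x700 zone of interest
    print(round(readout_gain(5000, 5000, 1000, 700), 1))
    # FIG. 7 example: 4000x4000 array, 500x500 zone
    print(readout_gain(4000, 4000, 500, 500))
```

For the ballet example the ratio is 25,000,000 / 700,000, a little under 36, consistent with the roughly 35 times figure given above. For the FIG. 7 array and zone the same calculation gives 64.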

It should be noted that in the above embodiments the words picture and photograph are interchangeable, as are photographing or photography and picture-taking. The camera used for same is preferably but not necessarily a solid state TV camera whose pixels are scanned serially or randomly under program command.

FIG. 8

The invention can also be used to sense positions of people for instructional purposes. Data as to a dancer's movements for example can be obtained, and appropriate images, or data, or both, transmitted without excessive bandwidth requirements to a remote location for comment or interaction by a trained professional. Combined with life-size screen displays, this allows a lifelike training experience to be gained at low cost, since one professional can watch 10 students in different locations say, each trying her movements alone in the intervening moments. In addition such training can occur in the home, as if one had a private tutor or coach.

For example consider FIG. 8. A class of ballet students is practicing near a “mirror”, which in this case comprises a life size digital display screen 800 illuminated from the rear by a Sharp brand projector 801 driven by computer 810. By sliding a real mirror in and out, the mirror can be a mirror, or a display. If desired, this display can be extensive, for example using 3 projectors to cover 3 adjacent screens each 6 feet high × 9 feet long, such that the total length of a large studio is covered.

A master instructor 825 (possibly remotely located via the internet or other communication means) can observe the students via TV camera (or cameras). By viewing the students the instructor can make corrections via audio, or by calling up imagery which represents the appropriate moves, for example from a professional doing the same Swan Lake number. In addition, the TV cameras of the invention can monitor the actual location and movements of the student, or students, and their relationship to each other, and if desired to various markers such as 830 on the floor of the studio, placed there to assist in choreographing the piece.

In addition, if the various gesture and position monitoring aspects of the invention are utilized as described above and in co-pending applications, it is possible to have the instructions computer generated, using the dancer's movements as input to a computer analysis program. This is particularly useful if dance routines which are classical in nature are being attempted, which have known best forms which can be computer modeled.

In another version, an assistant can be on the scene, say working with ten students in a local studio, while the master is remote.

It is also possible with the invention to provide input image data to projector computer 810, even from remote internet located sources, which represents other people dancing for example. These can be images of the master, or others in the class, even if all are in different locations. Or the images can be those of others who have performed a particular routine in the past, for example the Dance of the Sugar Plum Fairy in the Nutcracker. This imagery could be from the Bolshoi ballet performance of the same dance, displayed in a small town ballet studio or home, to illustrate the moves required. The use of life size projection not only gives a feel to this imagery, but further allows, I have discovered, a unique experience for the performer, namely that the person can perform “with” the troupe displayed. In some cases, in ballet for example, this can be more useful than watching one's self in the mirror (typical in ballet studios).

By using the cameras of the invention, such as stereo pair 850 and 851, to determine student positions, it is also possible to control the display in many ways. For example, as the student gets closer to the display, the persons in the display could appear to come closer to the student. Conversely, it might be desirable to have them move away from the student to keep a constant apparent distance between them for example. And if the student is twirling left, the figures in the ballet depicted on the screen can be caused to turn right (as they are “in the mirror” so to speak) to match the movement of the student in approximate form at least.
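One way to read the display behaviors just described is as simple mappings from the sensed student position to display parameters. The sketch below is a hypothetical illustration, not a disclosed implementation: it models the displayed figures as standing at a virtual depth behind the screen, and the names `virtual_depth`, `mirrored_turn`, `gain` and `target_separation`, together with the sign convention for turns, are all assumptions.

```python
def virtual_depth(student_distance, mode, gain=1.0, target_separation=4.0):
    """Depth behind the screen at which the displayed figures are rendered.

    student_distance:  sensed distance of the student from the screen.
    mode "approach":   figures come closer as the student comes closer.
    mode "constant":   figures back away so that the student-to-figure
                       separation stays near `target_separation`.
    """
    if mode == "approach":
        return gain * student_distance
    if mode == "constant":
        return max(target_separation - student_distance, 0.0)
    raise ValueError("mode must be 'approach' or 'constant'")


def mirrored_turn(student_turn_deg):
    """Turn applied to the displayed figures for a sensed student turn.

    Positive values mean a turn to the left; as in a mirror, the figures
    turn the opposite way.
    """
    return -student_turn_deg


if __name__ == "__main__":
    print(virtual_depth(1.0, "constant"))   # student 1 unit away, figures 3 behind
    print(mirrored_turn(90))                # student turns left, figures turn right
```

The stereo pair would supply `student_distance` and the turn angle, and the renderer would use the returned values to scale and rotate the displayed troupe.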

In addition it is often desirable for learning purposes to control the speed of the music and video display to match the sensed movements of the pupil, or those of a remote master. For this, display techniques which can produce a variable motion display can be used, such as a variable speed DVD disc, or reading the data into RAM. In addition it is desirable that the master's voice could be overlaid.

The invention can be advantageously used in many performing arts, not just ballet. For example, in live theatre, actors from Hamlet performances of the past can interact with those practicing. Or instructors of skating, gymnastics or other activities can also interact.

Sports as well is amenable to the technique, but the size of the “studio” or gym becomes an issue. Basketball for example fits the space aspect of the projection screens and the fields of view of the invention cameras as here described.

The ability to use remotely located masters, and the use of copyrighted performance material of famous performers and troupes, allows one to franchise the studio concept of the invention. For example each town could have a Bolshoi studio franchise of this type.

It is noted that this same arrangement can serve other purposes beyond instruction. One is the possibility of remote dating, in which sensed movement of one partner is communicated, along with voice and visual expression, to the other. In addition, it is possible, as disclosed in co-pending applications, to build the displays described above in the form of a touch screen in which contact of one partner with the display of the other, remotely transmitted from afar, can occur.

If one uses large scale touch screens with optional added sensor inputs, as would be the ballet studio example of FIG. 8 if equipped with touch screen capability, then one can provide a mechanism for marketing of people-relative (i.e. life size) objects such as automobiles in facilities such as auto showrooms. Thus a ballet studio for example can be used for other purposes, not just instructional, but for selling cars for example, where the display screen is displaying new models (including ones that are figments of design imagination, and where customer input is desired as in a focus group) and where customer inputs of voice and action can be detected if desired by the invention. Or in reverse, an underused car showroom can be converted, on demand, into a site which can be used for, among other things, instructional purposes in performing arts, sports and the like. This gives a reason for being to the showroom that transcends selling cars, and helps attract people to the facility. If a car were displayed on a touch screen, one could walk up to the full size display of the car and touch the door handle, which would cause the touch screen to sense that same had occurred, and indicate to the computer to cause the display to show the door opening to expose the interior.

1. A portable device comprising: a device housing including a forward facing portion, the forward facing portion including an electro-optical sensor having a field of view and a digital camera separate from the electro-optical sensor; and a processing unit within the device housing and operatively coupled to electro-optical sensor, wherein the processing unit is adapted to control the digital camera in response to a gesture performed in the electro-optical sensor field of view.
2. The portable device of claim 1 wherein the gesture corresponds to an image capture command.
3. The portable device of claim 1 wherein the determined gesture includes a hand motion.
4. The portable device of claim 1 wherein the determined gesture includes a pose.
5. The portable device of claim 1 wherein the electro-optical sensor is fixed in relation to the digital camera.
6. The portable device of claim 1 further including a forward facing light source.
7. The portable device of claim 1 wherein the electro-optical sensor defines a resolution less than a resolution defined by the digital camera.
8. The portable device of claim 1 wherein the electro-optical sensor includes at least one of a CCD detector and a CMOS detector.
9. A computer implemented method comprising: providing a portable device including a digital camera on a forward facing portion thereof, the digital camera defining a field of view; determining, using a processing unit, a gesture performed in the digital camera field of view; and capturing an image to the digital camera in response to the determined gesture corresponding to an image capture command.
10. The method according to claim 9 wherein the determined gesture includes a hand motion.
11. The method according to claim 9 wherein the determined gesture includes a pose.
12. The method according to claim 9 further including providing a forward facing electro-optical sensor and detecting, using the electro-optical sensor, the gesture performed in the digital camera field of view.
13. The method according to claim 12 wherein the electro-optical sensor includes first and second sensors in fixed relation relative to the digital camera.
14. The method according to claim 12 wherein the electro-optical sensor defines a resolution less than a resolution defined by the digital camera.
15. An image capture device comprising: a digital camera adapted to capture an image and having a field of view; a sensor adapted to detect a gesture in the digital camera field of view; and a processing unit operatively coupled to the sensor and to the digital camera, wherein the processing unit is adapted to correlate a gesture detected by the sensor with an image capture function and subsequently capture an image using the digital camera.
16. The image capture device of claim 15 wherein the determined gesture includes a hand motion.
17. The image capture device of claim 15 wherein the determined gesture includes a pose.
18. The image capture device of claim 15 further including a forward facing light source.
19. The image capture device of claim 15 wherein the sensor defines a resolution less than a resolution defined by the digital camera.
20. The image capture device of claim 15 wherein the sensor is fixed in relation to the digital camera.